Abstract


Cancer is a deadly disease that affects humans of all races, genders, and
ages. The matrix metalloproteinase-2 (MMP-2) protein is a good target when
designing an anticancer drug. The expression of this protein influences cell
growth and division. The activation of this protein opens the extracellular
matrix and provides entry of the new cells into the body system.
Phytochemicals are known to possess the ability to cure human diseases
with little or no side effects. In this study, phytochemicals from yellow
mombin, turmeric, green chiretta, African basil and ginger were evaluated
against the MMP-2 orthosteric sites, and a three-dimensional quantitative
structure-activity relationship (3D-QSAR) was used to generate a model for
MMP-2 inhibitors. The drug-like properties of the lead compounds and the
standard drug were tested by employing the Lipinski rule of five. Azulene
from ginger, andrographidine A from green chiretta and isovitexin from
African basil, with docking scores of -7.3 kcal/mol, -9.3 kcal/mol, and
-8.2 kcal/mol, respectively, were found to be the lead compounds as
potential MMP-2 inhibitors. A robust regression model for the inhibition
of MMP-2 was generated. Andrographidine A, with the highest docking
score, stood out as a potential inhibitor of MMP-2 by sharing selective
interactions with his-120 and his-130. The QSAR model proposed herein
was thoroughly validated and hence offers a tool for the identification of
potential MMP-2 inhibitors in the future.


This work is licensed under the Creative Commons Attribution-NonCommercial
4.0 International License.





Introduction 


Cancer alters the normal properties of cells via different molecular
events [1]. Normal control systems that prevent cell overgrowth and the
invasion of other tissues are disabled in cancer cells [2]. Cancer is one
of the deadliest diseases in both developed and developing countries.
Lung, breast, prostate and large bowel cancers are the most common types
of cancer and are responsible for more than half of all cases [3].
Zinc-dependent proteases like matrix metalloproteinases (MMPs), or
matrixins, cleave and rebuild connective tissue components like elastin,
collagen, gelatin and casein [4]. MMPs also degrade the extracellular
matrix during growth, morphogenesis and other developmental stages. Owing
to their physiological functions, MMPs have been reported to be highly
active in diseases and pathological processes like inflammation and
cancer [4]. High MMP expression levels contribute to the development of
cancer [5]. Because of the importance of MMPs in disease, efforts have
been made to develop small-molecule drugs that can inhibit MMPs. However,
all of these efforts failed in clinical trials because of low specificity:
MMP inhibitors bind to Zn²⁺ and other heavy metals at the active sites of
various proteins in the body and are therefore highly toxic [6, 7]. Also,
drugs directed at multiple MMP family members evoked unexpected effects
because MMP activities are numerous. In fact, some MMPs have been reported
to play anti-tumorigenic roles [8, 9]; hence, there is a need to develop
inhibitors that target only one or a narrow range of MMPs. Studies have
repeatedly shown that consuming fruits and vegetables regularly helps
greatly in reducing the risk of developing chronic diseases such as
cancer, owing to the antioxidant activity of the phytochemicals present,
which are also less toxic even after reacting with Zn²⁺, which is
responsible for activating MMP-2 [10]. Some traditional healers in Nigeria
claim that they can successfully cure cancer using herbs [11].

In the present study, phytochemicals from Spondias mombin, Curcuma longa,
Andrographis paniculata, Ocimum gratissimum, and Zingiber officinale were
screened for their inhibitory properties against MMP-2. The conformity of
the lead compounds to Lipinski’s rule of five (RO5) was assessed. The
accuracy of the docking results was validated by correlation coefficient
analysis with reported MMP-2 inhibitors. In addition, a three-dimensional
quantitative structure-activity relationship (3D-QSAR) was used to
generate a model for MMP-2 inhibitors, and the amino acid residues
involved in the interactions of the leads were determined.


Materials and Methods 
Protein preparation for docking 


The MMP-2 protein with protein data bank (PDB) ID 1hov and a
crystallographic resolution of 2.50 Å (Fig. 1) [12] was downloaded from
the Protein Data Bank (http://www.rcsb.org). Using the PyMOL
AutoDock/Vina plugin, the co-crystallized ligand was extracted from the
MMP-2 protein. Tanomastat, an anticancer drug that targets the MMP
family, was downloaded from the PubChem repository
(https://pubchem.ncbi.nlm.nih.gov) and used as the standard drug (Fig. S1).

Fig. 1 Matrix metalloproteinase-2 (green) and a co-crystallized
compound (red). The grey balls represent Ca while the orange balls
represent Zn. PDB ID: 1ho


Ligand preparation for molecular docking 


A total of fifty-eight, fifty-five, thirty, fifty-three and fifty-five
phytochemicals characterized from yellow mombin (Spondias mombin),
turmeric (Curcuma longa), green chiretta (Andrographis paniculata),
African basil (Ocimum gratissimum) and ginger (Zingiber officinale),
respectively, were downloaded in the structure-data file (sdf) format
from the PubChem database (https://pubchem.ncbi.nlm.nih.gov). The
phytochemicals were concatenated and converted to pdb format using Open
Babel and later to pdbqt with the ligand preparation command lines in
AutoDock Vina. The phytochemicals (206) from the plants were docked into
the active sites of MMP-2.


Molecular properties and Lipinski’s rule of five 


The MarvinView software (www.chemaxon.com) was employed in the present
study to assess the conformity of the lead compounds to Lipinski’s rule
of five (RO5). The RO5 helps to determine drug-likeness and to check the
absorption, distribution, metabolism, and excretion (ADME) properties of
the leads. The number of rotatable bonds and the polar surface area,
which help in differentiating orally active compounds from those that are
not, were also obtained for the leads [13].


Validation of the docking results 


The docking results were validated using a multiple alignment of the MMP
receptor sequences obtained from PubMed. The MMP-2 sequences were
searched with BLAST against the online ChEMBL database
(www.ebi.ac.uk/chembl/). The resulting hit, with an identity of 100%, an
IC50 (half maximal inhibitory concentration) value of 5000 and a Ki value
of 721, was downloaded in text format and converted to sdf format with
DataWarrior version 2 (www.openmolecules.org). This sdf file was then
converted to pdb and pdbqt. A total of two hundred and ten compounds were
obtained from the ChEMBL file and were docked into the MMP-2 catalytic
site in the same way as the phytochemicals. A correlation graph was
plotted with the pIC50 values and the docking scores obtained from
docking the ChEMBL compounds into the MMP-2 catalytic site. The
significance of the correlation between the negative log of the IC50
(pIC50) and the docking scores was determined at P<0.05.
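
The correlation check described above can be sketched as follows. This is a minimal illustration, not the authors’ script: the function names and the example IC50 values and docking scores are hypothetical, IC50 is assumed to be given in nanomolar, binding strength is assumed to be the magnitude of the (negative) docking score, and the significance test is omitted.

```python
import math


def pic50_from_nanomolar(ic50_nm):
    """Return pIC50 = -log10(IC50 in mol/L) for an IC50 given in nM."""
    return -math.log10(ic50_nm * 1e-9)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical inhibitors: IC50 in nM and docking score in kcal/mol.
ic50_nm = [5000.0, 721.0, 50.0, 12.0]
docking_scores = [-6.1, -7.0, -8.4, -9.0]

pic50 = [pic50_from_nanomolar(v) for v in ic50_nm]
# Assumption: a more negative score means stronger binding, so negate it.
strength = [-s for s in docking_scores]
r = pearson_r(pic50, strength)
print(f"Pearson r = {r:.3f}")
```

A positive r here means that compounds with stronger predicted binding also tend to have higher measured potency.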


Three-dimensional quantitative structure-activity relationship (3D-QSAR)
Data collection and descriptor calculation


The MMP-2 bioassay IC50 data were downloaded from the PubChem-ChEMBL
database and saved in Excel format. The bioassay data were converted to
sdf format with DataWarrior, and the sdf structures were concatenated and
converted to their three-dimensional structures using command lines. The
Chemistry Development Kit (CDK) 1.4.6 was used for the calculation of the
molecular descriptors [14].


Data pre-treatment 


Pretreatment with the aim of removing collinearity among the descriptors
was carried out with the V-WSP algorithm [15]. The variance cut-off was
set at 0.0001, while the correlation coefficient cut-off was set at 0.5.
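
The effect of the two cut-offs can be illustrated with a simplified filter. This is not the V-WSP algorithm itself but a greedy stand-in written for illustration: it drops near-constant descriptors first and then drops any descriptor that correlates too strongly with one already kept. The function names and the toy descriptor table are hypothetical.

```python
import math
from statistics import pvariance


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


def filter_descriptors(table, var_cutoff=0.0001, corr_cutoff=0.5):
    """Return the names of descriptors that survive both cut-offs.

    table maps a descriptor name to its values, one value per molecule.
    """
    kept = []
    for name, values in table.items():
        # Near-constant descriptors carry no information.
        if pvariance(values) < var_cutoff:
            continue
        # Skip descriptors that are collinear with one already kept.
        if any(abs(pearson_r(values, table[k])) > corr_cutoff for k in kept):
            continue
        kept.append(name)
    return kept


# Hypothetical descriptor table for four molecules.
toy = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.1],   # almost a multiple of "a"
    "c": [5.0, 5.0, 5.0, 5.0],   # constant
    "d": [1.0, -1.0, 1.0, -1.0],
}
print(filter_descriptors(toy))
```

The result of a greedy filter depends on the order in which descriptors are visited, which is one reason dedicated algorithms are preferred in practice.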


Data set division: training and test sets 


A dataset of 100 MMP-2 inhibitors was obtained from the ChEMBL database
(http://ebi.ac.uk). The data were split into a training set (70%) and a
test set (30%) using the Kennard-Stone algorithm built into the Dataset
Division GUI 1.2 [16].
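
The Kennard-Stone selection can be sketched in a few lines. This is a generic illustration of the algorithm, not the Dataset Division GUI implementation: it assumes Euclidean distance in descriptor space, and the function name and the toy points are hypothetical.

```python
import math


def kennard_stone(points, n_train):
    """Split sample indices into (train, test) with Kennard-Stone selection.

    points is a list of equal-length numeric tuples (descriptor vectors).
    """
    n = len(points)
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and len(points)")
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Seed the training set with the two most distant samples.
    pairs = ((i, j) for i in range(n) for j in range(i + 1, n))
    first, second = max(pairs, key=lambda ij: dist[ij[0]][ij[1]])
    selected = [first, second]
    remaining = [k for k in range(n) if k not in (first, second)]

    # Repeatedly add the sample farthest from its nearest selected sample.
    while len(selected) < n_train:
        nxt = max(remaining, key=lambda k: min(dist[k][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected, remaining


# Hypothetical one-descriptor data set with five molecules.
toy_points = [(0.0,), (1.0,), (2.0,), (10.0,), (5.0,)]
train_idx, test_idx = kennard_stone(toy_points, n_train=3)
print(train_idx, test_idx)
```

For a 70/30 split of 100 compounds, `n_train` would be 70. The selection spreads the training set over the descriptor space, so the test compounds tend to lie inside the region covered by the training compounds.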


Genetic algorithm and multiple linear regression analysis


A genetic algorithm, a search heuristic that mimics the natural selection
process, was used to select the significant variables (descriptors). Each
candidate descriptor was assessed with a fitness function, and the
descriptors with the best fitness were retained. A training set of 70
MMP-2 molecular structures was used, and an equation length of ten
descriptors (variables) was adopted. Multiple linear regression (MLR) and
generation of the unbiased model equation were carried out with the R
software for statistical computing.
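
The MLR step can be illustrated with an ordinary least squares fit. The study used R; the sketch below is written in Python only for illustration, solves the normal equations with Gaussian elimination, and uses hypothetical function names and toy data. The genetic algorithm step is not shown.

```python
def fit_mlr(X, y):
    """Ordinary least squares; returns [intercept, b1, ..., bp]."""
    rows = [[1.0] + list(r) for r in X]          # prepend the intercept column
    p = len(rows[0])
    # Normal equations: (X^T X) b = X^T y, stored as an augmented matrix.
    M = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    # Forward elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("descriptors are linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, p):
            factor = M[r][col] / M[col][col]
            for k in range(col, p + 1):
                M[r][k] -= factor * M[col][k]
    # Back substitution.
    coef = [0.0] * p
    for i in range(p - 1, -1, -1):
        tail = sum(M[i][k] * coef[k] for k in range(i + 1, p))
        coef[i] = (M[i][p] - tail) / M[i][i]
    return coef


def predict(coef, x):
    """Predicted response for one descriptor vector x."""
    return coef[0] + sum(b * v for b, v in zip(coef[1:], x))


def r_squared(y, y_hat):
    """Coefficient of determination of predictions y_hat against y."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical data generated from y = 1 + 2*x1 + 3*x2.
X_toy = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y_toy = [9.0, 8.0, 19.0, 18.0, 26.0]
coef = fit_mlr(X_toy, y_toy)
r2 = r_squared(y_toy, [predict(coef, x) for x in X_toy])
```

With ten descriptors the same function applies unchanged; each row of `X` would then hold the ten descriptor values of one training compound.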


Results and Discussion 


Molecular docking 


In the present study, phytochemicals from yellow mombin, turmeric, green
chiretta, African basil and ginger screened against the MMP-2 protein
catalytic site revealed azulene from ginger, andrographidine A from green
chiretta and isovitexin from African basil, with docking scores of -7.3
kcal/mol, -9.3 kcal/mol, and -8.2 kcal/mol, respectively, as the lead
compounds (Table 1; Table S1-S5; Fig. S1). The docking score (-9.2
kcal/mol) of tanomastat against the catalytic site of MMP-2 was used as
the cut-off for the selection of the leads. Tanomastat possesses both
antiangiogenic and antimetastatic properties [12]. It is worthy of note
that the leads identified herein through molecular docking screening have
all been reported elsewhere to have anticancer properties

Table 1 The docking scores of lead phytocompounds from ginger, green
chiretta, and African basil and the standard drug.

Plant            Phytochemical        Docking score (kcal/mol)
Ginger           Azulene              -7.3
Green chiretta   Andrographidine A    -9.3
African basil    Isovitexin           -8.2
Standard drug    Tanomastat           -9.2

Table 2 Lipinski’s physicochemical properties of the lead compounds.

Plant            Hit compound         Docking score (kcal/mol)
Ginger           Azulene              -9.5
Green chiretta   Andrographidine A*   -9.2
African basil    Isovitexin**         -10.2
Standard drug    Tanomastat*          -9.2


HBA HBD RB XLOGP3 MW PSA 
<5 <10 <5 <500 140) 
0 0 ee 128.17 0 
4 6 0.7 426.45 144.14 
7 3 3 432.38 177.14 


1 8 5.5 410.91 54.37 


HBA = hydrogen bond acceptor; HBD = hydrogen bond donor; RB = rotatable bond; XLOGP3 = octanol-water partition coefficient; MW = molecular weight; PSA = polar surface area


* Those that disobeyed one of the Lipinski rules; ** Those that disobeyed two of the Lipinski rules. 


[17-19]. This clearly shows that this technique 
(molecular docking screening) can be applied for 
the identification of novel anti-cancer compounds. 


Lipinski rule of five (RO5)


The RO5 helps to evaluate drug-likeness, that is, to determine whether a
chemical compound with a certain pharmacological or biological activity
possesses properties that qualify it as a likely orally active drug in
humans [20]. According to Lipinski, an orally active drug must not
violate more than one of these rules: (1) not more than 5 hydrogen bond
donors, (2) not more than 10 hydrogen bond acceptors, (3) a molecular
mass less than 500 Daltons and (4) an octanol-water partition coefficient
(log P) not greater than 5. This is supplemented by an additional rule
proposed by Veber and coworkers [13]: (5) not more than 10 rotatable
bonds and a polar surface area (PSA) less than 140 Å². Andrographidine A
disobeyed one of the rules while isovitexin disobeyed two, as their
values deviate slightly from the prescribed limits (Table 2). Azulene and
andrographidine A, with no more than one violation of the rules, are
likely orally active phytocompounds. Isovitexin, with two violations of
the rules, may not be an orally active drug (Table 2). However, there are
reports of exceptions to Lipinski’s rule of five, mostly among natural
products [21].
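
The rule check described above can be expressed as a small function. This is an illustrative sketch: the function names are hypothetical, the Veber criteria are taken in their commonly stated form (at most 10 rotatable bonds and a polar surface area of at most 140 Å²), and the example property values are made up rather than taken from Table 2.

```python
def ro5_violations(hbd, hba, mw, logp):
    """Return the list of Lipinski rules that a compound violates."""
    rules = {
        "more than 5 hydrogen bond donors": hbd > 5,
        "more than 10 hydrogen bond acceptors": hba > 10,
        "molecular mass of 500 Da or more": mw >= 500,
        "log P greater than 5": logp > 5,
    }
    return [name for name, broken in rules.items() if broken]


def passes_ro5(hbd, hba, mw, logp):
    """A compound passes if it violates at most one rule."""
    return len(ro5_violations(hbd, hba, mw, logp)) <= 1


def passes_veber(rotatable_bonds, psa):
    """Veber criteria: <= 10 rotatable bonds and PSA <= 140 square angstroms."""
    return rotatable_bonds <= 10 and psa <= 140


# Hypothetical compound used only to exercise the functions.
example = {"hbd": 7, "hba": 10, "mw": 432.38, "logp": 0.2}
print(ro5_violations(**example))
print(passes_ro5(**example), passes_veber(rotatable_bonds=3, psa=177.14))
```

Keeping the Lipinski and Veber checks separate makes it explicit which criterion a compound fails.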


Validation of docking results 


The reliability of the docking results in the present study was validated
by correlation coefficient analysis of the docking scores generated from
docking 271 reported MMP-2 inhibitors from the ChEMBL database against
their corresponding pIC50. There was a significant positive correlation
(R² = 0.459) between the docking scores of the MMP-2 inhibitors and their
corresponding experimentally derived pIC50 at P<0.001 (Fig. 2; Table 3).
This indicates that the docking scores track the experimentally measured
activities, which supports the reliability of the docking scores obtained
herein.


Quantitative structure-activity relationship (QSAR) and regression analysis


QSAR predicts the relationship that exists between the structure and
activity of a compound. Seventy (70) MMP-2 inhibitors were used as the
training set. The linear regression analyses of the training set were
carried out with the R software for statistical computing. The Pearson
correlation (R) when all ten (10) descriptors were used was 0.976 (Table
S6). This represents a very strong correlation. The R² value of 0.952
shows that the ten (10) descriptors account for about 95% of the
variation in the pIC50 (the IC50 values were converted to pIC50 with the
formula pIC50 = -log IC50). The adjusted R² is concerned with how well
the model generalizes, which relates to the external validation of the
model. The adjusted R² was close to the R² value (the difference between
the R² value and the adjusted R² value was 0.044) (Table S6); this
signifies that our model experienced a shrinkage of only 4.4% in
predicting external pIC50. The closeness of the adjusted R² value to the
R² value shows that the cross-validity of the model is very good.

The Durbin-Watson statistic of 1.311976 (Table S6) indicates that the
assumption of independent errors is tenable and the model is valid [22].
As a conservative rule of thumb, Durbin-Watson values less than 1 or
greater than 3 are cause


Table 3 Correlation coefficient analysis of the docking scores of matrix
metalloproteinase-2 inhibitors and their corresponding experimentally
derived pIC50.

Correlation coefficient                         0.459**
Sig. (2-tailed)                                 <0.0001
N                                               271
Bootstrap bias                                  -0.002
Bootstrap std. error                            0.048
Bootstrap BCa 95% confidence interval, lower    0.368
Bootstrap BCa 95% confidence interval, upper    0.541

** Significant correlation at P<0.001

Fig. 2 Correlation of the docking scores of matrix metalloproteinase-2
inhibitors with their corresponding experimentally derived pIC50 (axes:
docking scores and pIC50).

Fig. 3 Histogram plot of the differences between the observed and the
predicted pIC50 (residuals); the vertical axis is frequency.

Fig. 4 Scatter plot of the observed pIC50 values against the predicted
pIC50 values of the training set. The R² value of 0.872 indicates that
the model accounts for over 87% of the variation in the observed pIC50
values; hence, the model is unbiased and valid (Table S8).
for concern [22]. The ANOVA table (Table S7), with an F-ratio (the ratio
of the improvement in prediction due to the model relative to the
inaccuracy that remains in the model) of 21.77, significant at P<0.001,
reveals that the model is significantly better at predicting the pIC50 of
the training set than when the model is not applied. The histogram plot
in Fig. 3 shows the normality of the residuals (the differences between
the observed and the predicted pIC50). The histogram is symmetrical and
bell-shaped, which shows that the residuals follow a normal distribution
and supports the use of multiple linear regression.
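
For reference, the Durbin-Watson statistic is the sum of squared differences between consecutive residuals divided by the sum of squared residuals; values near 2 indicate no first-order autocorrelation. A minimal sketch follows, with a hypothetical function name and made-up residuals chosen only to show the two extremes.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals in observation order."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


# Made-up residuals: perfectly alternating signs versus a constant offset.
alternating = [1.0, -1.0, 1.0, -1.0]
constant = [1.0, 1.0, 1.0, 1.0]
print(durbin_watson(alternating), durbin_watson(constant))
```

The statistic always lies between 0 and 4, which is why the rule of thumb flags values below 1 or above 3.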


Generation of regression model equation 


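$$Y = MX + C \tag{1}$$

$$Y = B_0 + B_1X_1 + B_2X_2 + B_3X_3 + B_4X_4 + \dots + B_nX_n \tag{2}$$

$$\mathrm{pIC}_{50} = B_0 + B_1X_1 + B_2X_2 + B_3X_3 + B_4X_4 \tag{3}$$

$$
\begin{aligned}
\mathrm{pIC}_{50} ={} & 61.979 + (-0.001 \times \text{MOMI-Z}) + (0.366 \times \text{MOMI-YZ}) + (0.073 \times \text{nSmallRings}) \\
& + (0.991 \times \text{MOMI-XY}) + (0.333 \times \text{khs.sCH3}) + (0.012 \times \text{DPSA-3}) + (2.118 \times \text{ATSc2}) \\
& + (0.8 \times \text{Wlambda2.unity}) + (0.064 \times \text{NAtomLAC}) + (-4.579 \times \text{BCUTw-1l})
\end{aligned}
\tag{4}
$$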


Eq. 4 is the regression model equation, where Y or pIC50 is the dependent
variable, X is the independent variable, and B0 and Bi are regression
coefficients.

MOMI-Z: moment of inertia along the Z-axis; MOMI-YZ: moment of inertia
along the Y and Z axes; nSmallRings: an enumeration of all the small
rings (sizes 3 to 9) in a molecule; MOMI-XY: moment of inertia along the
X and Y axes; khs.sCH3: descriptor that calculates Kier and Hall
molecular indices; DPSA-3: difference of PPSA-1 and PNSA-1; ATSc2
(PaDEL; 2D): ATS autocorrelation descriptor, weighted by charges;
Wlambda2.unity: directional WHIM descriptor weighted by unit weights;
NAtomLAC: the number of atoms in the longest aliphatic chain; BCUTw-1l:
eigenvalue-based descriptor.
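
Eq. 4 can be applied to a new compound once its ten descriptors have been calculated. The sketch below hard-codes the intercept and coefficients reported in Eq. 4; the function name and the dictionary keys (including the spelling `BCUTw-1l`) are assumptions made for illustration, and the descriptor values in the example are hypothetical.

```python
INTERCEPT = 61.979

# Coefficients as reported in Eq. 4, keyed by descriptor name.
COEFFICIENTS = {
    "MOMI-Z": -0.001,
    "MOMI-YZ": 0.366,
    "nSmallRings": 0.073,
    "MOMI-XY": 0.991,
    "khs.sCH3": 0.333,
    "DPSA-3": 0.012,
    "ATSc2": 2.118,
    "Wlambda2.unity": 0.8,
    "NAtomLAC": 0.064,
    "BCUTw-1l": -4.579,
}


def predict_pic50(descriptors):
    """Evaluate Eq. 4 for a mapping of descriptor name to value."""
    missing = sorted(set(COEFFICIENTS) - set(descriptors))
    if missing:
        raise KeyError(f"missing descriptors: {missing}")
    return INTERCEPT + sum(
        coef * descriptors[name] for name, coef in COEFFICIENTS.items()
    )


# Hypothetical descriptor values for a single compound.
compound = {name: 0.0 for name in COEFFICIENTS}
compound["nSmallRings"] = 2.0
compound["NAtomLAC"] = 3.0
print(round(predict_pic50(compound), 3))
```

Raising an error on a missing descriptor avoids silently treating it as zero, which would shift the prediction by the size of that term.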


Determination of the leads’ pIC50 using the derived QSAR model


The pIC50 values of the lead phytocompounds (Table S9), azulene,
andrographidine A and isovitexin, were predicted using the QSAR model
generated in the present study (Eq. 4). Isovitexin possessed the highest
pIC50 value of 10.52, followed by andrographidine A with a pIC50 of 9.59,
while azulene possessed the lowest pIC50 value of 8.70. It is worthy of
note that the three lead compounds herein have been documented to possess
anti-cancer properties [17-19].

Fig. 5 Molecular interactions of amino acid residues within the active
site of matrix metalloproteinase-2 with the standard drug, tanomastat
(green stick). The red dotted lines represent hydrophobic interactions,
blue dotted lines represent hydrogen bond interactions, yellow dotted
lines represent salt bridges and green dotted lines represent pi-cation
interactions.

Fig. 6 Molecular interactions of amino acid residues within the active
site of matrix metalloproteinase-2 with isovitexin (red stick). The blue
dotted lines represent hydrogen bond interactions.

Fig. 7 Molecular interactions of amino acid residues within the active
site of matrix metalloproteinase-2 with azulene (blue stick). The red
dotted lines represent hydrophobic interactions.

Fig. 8 Molecular interactions of amino acid residues within the active
site of matrix metalloproteinase-2 with andrographidine A (orange stick).
The red dotted lines represent hydrophobic interactions, blue dotted
lines represent hydrogen bond interactions, yellow dotted lines represent
salt bridges and green dotted lines represent pi-cation interactions.


Analysis of molecular interactions of the standard and the lead compounds


The interactions with key amino acid residues in the catalytic site are
of great importance in influencing the inhibitory activity of MMP
inhibitors [23]. The molecular interactions of the standard drug,
isovitexin, azulene and andrographidine A with amino acid residues in the
active site of MMP-2 are shown in Fig. 5, 6, 7, and 8, respectively.
Tanomastat forms two hydrogen bond interactions, with leu-150 and
ala-136; hydrophobic interactions with val-42, leu-83, phe-115, leu-116,
ala-119, his-120, leu-137, tyr-142, thr-145, phe-148 and leu-150; a
pi-cation interaction with his-120; and a salt bridge with arg-149
(Fig. 5). Isovitexin, on the other hand, forms only hydrogen bond
interactions, six in all, involving gly-73, gly-81, leu-83, his-124, and
ala-139 within the MMP-2 catalytic site (Fig. 6). Azulene forms five
hydrophobic interactions, with leu-83, his-120, ala-84, tyr-142 and
val-117 (Fig. 7), while andrographidine A forms three hydrogen bond
interactions with ala-84, glu-121 and ala-139, four hydrophobic
interactions with asp-72, tyr-74, leu-82 and his-85, one pi stacking
interaction with his-85 and two salt bridges with his-120 and his-130
(Fig. 8; Table 4). Tanomastat and azulene share a common hydrophobic
interaction with tyr-142. Tanomastat, azulene and andrographidine A all
share his-120. Leu-83 is common to tanomastat, isovitexin and azulene
(Table 4). The active site of MMPs contains histidine residues (his-120
and his-130), which are bound to the Zn atom and are highly conserved in
MMPs [24]. According to Agrawal and co-workers [25], the selectivity of
an MMP-2 inhibitor depends on the nature of the zinc-binding group (ZBG).
As evident in the present study, andrographidine A shares selective
interactions with his-120 and his-130 and is hence a potential MMP-2
inhibitor. The docking scores of andrographidine A, azulene and
isovitexin are -9.3 kcal/mol, -7.3 kcal/mol and -8.2 kcal/mol,
respectively, while that of the standard is -9.2 kcal/mol. The high
docking score of andrographidine A is probably due to its extensive
molecular interactions with key residues within the active site of MMP-2
when compared to the other leads (isovitexin and azulene). The higher
docking score of isovitexin compared to azulene is probably due to the
exclusively hydrogen bond interactions of the former.

Table 4 Amino acid residues involved in molecular interactions of the
standard drug (tanomastat) and the lead compounds within the active site
of matrix metalloproteinase-2.

Compound                     Hydrogen bonds             Hydrophobic interactions              Pi interactions   Salt bridges
Tanomastat (standard drug)   leu-150, ala-136           val-42, leu-83, phe-115, leu-116,     his-120           arg-149
                                                        ala-119, his-120, leu-137, tyr-142,
                                                        thr-145, phe-148, leu-150
Isovitexin                   gly-73, gly-81, leu-83,    -                                     -                 -
                             his-124, ala-139
Azulene                      -                          leu-83, his-120, ala-84, tyr-142,     -                 -
                                                        val-117
Andrographidine A            ala-84, glu-121, ala-139   asp-72, tyr-74, leu-82, his-85        his-85            his-120, his-130
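
The shared contacts noted above can be recovered mechanically by pooling the residues listed for each compound, regardless of interaction type, and intersecting the sets. The sketch below uses the residues reported in this section; the variable and function names are hypothetical.

```python
# Residues contacted by each compound, pooled over all interaction types.
contacts = {
    "tanomastat": {
        "leu-150", "ala-136", "val-42", "leu-83", "phe-115", "leu-116",
        "ala-119", "his-120", "leu-137", "tyr-142", "thr-145", "phe-148",
        "arg-149",
    },
    "isovitexin": {"gly-73", "gly-81", "leu-83", "his-124", "ala-139"},
    "azulene": {"leu-83", "his-120", "ala-84", "tyr-142", "val-117"},
    "andrographidine A": {
        "ala-84", "glu-121", "ala-139", "asp-72", "tyr-74", "leu-82",
        "his-85", "his-120", "his-130",
    },
}


def shared_residues(*compounds):
    """Residues contacted by every one of the named compounds."""
    return set.intersection(*(contacts[name] for name in compounds))


print(sorted(shared_residues("tanomastat", "azulene", "andrographidine A")))
print(sorted(shared_residues("tanomastat", "isovitexin", "azulene")))
# Zinc-coordinating histidines contacted by andrographidine A.
print(sorted(contacts["andrographidine A"] & {"his-120", "his-130"}))
```

Pooling by residue hides the interaction type, so a shared residue here does not imply the same kind of contact in each complex.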


Conclusion 


Phytochemicals have been shown to possess and demonstrate anti-tumor
effects. With the advent of computer-aided drug design, it is now
possible to harness the diverse phytocompounds and explore their
anti-cancer properties in the design of novel anti-cancer drugs. The
present study reveals azulene, andrographidine A and isovitexin as
potential MMP-2 inhibitors. The QSAR model proposed herein was thoroughly
validated and hence offers a tool for the identification of potential
MMP-2 leads.